ABSTRACT



T-DNA tagging with a promoterless β-glucuronidase (GUS) gene
generated a transgenic Nicotiana tabacum plant that expressed GUS
activity only in developing seed coats. Cloning and deletion analysis of the
GUS fusion revealed that the promoter responsible for seed coat
specificity was located in the plant DNA proximal to the GUS gene.
Analysis of the region demonstrated that the seed coat-specificity of GUS
expression in this transgenic plant resulted from T-DNA insertion next to a
cryptic promoter. This promoter is useful for directing the expression of
genes to the developing seed coat in plant seeds.
A SEED COAT-SPECIFIC CRYPTIC PROMOTER IN TOBACCO

Field of Invention

This invention relates to a cryptic promoter identified from 
Nicotiana tabacum (tobacco). Specifically this invention relates to a seed 
coat-specific cryptic promoter isolated from tobacco.

Background and Prior Art 

Bacteria from the genus Agrobacterium have the ability to transfer
specific segments of DNA (T-DNA) to plant cells, where they stably
integrate into the nuclear chromosomes. Analyses of plants harbouring the
T-DNA have revealed that this genetic element may be integrated at
numerous locations, and can occasionally be found within genes. One
strategy which may be exploited to identify integration events within genes
is to transform plant cells with specially designed T-DNA vectors which
contain a reporter gene, devoid of cis-acting transcriptional and
translational expression signals (i.e. promoterless), located at the end of
the T-DNA. Upon integration, the initiation codon of the promoterless
gene (reporter gene) will be juxtaposed to plant sequences. The
consequence of T-DNA insertion adjacent to, and downstream of, gene
promoter elements may be the activation of reporter gene expression. The
resulting hybrid genes, referred to as T-DNA-mediated gene fusions,
consist of unknown and thus uncharacterized plant promoters residing at
their natural location within the chromosome, and the coding sequence of
a marker gene located on the inserted T-DNA (Fobert et al., 1991, Plant
Mol. Biol. 17, 837-851).

It has generally been assumed that activation of promoterless or
enhancerless marker genes results from T-DNA insertions within or
immediately adjacent to genes. The recent isolation of several T-DNA
insertional mutants (Koncz et al., 1992, Plant Mol. Biol. 20, 963-976;
reviewed in Feldmann, 1991, Plant J. 1, 71-82; Van Lijsebettens et al.,
1991, Plant Sci. 80, 27-37; Walden et al., 1991, Plant J. 1, 281-288;
Yanofsky et al., 1990, Nature 346, 35-39) shows that this is the case for at
least some insertions. However, other possibilities exist. One of these is
that integration of the T-DNA activates silent regulatory sequences that
are not associated with genes. Lindsey et al. (1993, Transgenic Res. 2,
33-47) referred to such sequences as "pseudo-promoters" and suggested that
they may be responsible for activating marker genes in some transgenic
lines.

Inactive regulatory sequences that are buried in the genome but
with the capability of being functional when positioned adjacent to genes
have been described in a variety of organisms, where they have been
called "cryptic promoters" (Al-Shawi et al., 1991, Mol. Cell. Biol. 11,
4207-4216; Fourel et al., 1992, Mol. Cell. Biol. 12, 5336-5344; Irniger et
al., 1992, Nucleic Acids Res. 20, 4733-4739; Takahashi et al., 1991, Jpn J.
Cancer Res. 82, 1239-1244). Cryptic promoters can be found in the introns
of genes, such as those encoding yeast actin (Irniger et al., 1992, Nucleic
Acids Res. 20, 4733-4739), and a mammalian melanoma-associated antigen
(Takahashi et al., 1991, Jpn J. Cancer Res. 82, 1239-1244). It has been
suggested that the cryptic promoter of the yeast actin gene may be a relict
of a promoter that was at one time active but lost function once the coding
region was assimilated into the exon-intron structure of the present-day
gene (Irniger et al., 1992, Nucleic Acids Res. 20, 4733-4739). A cryptic
promoter has also been found in an untranslated region of the second
exon of the woodchuck N-myc proto-oncogene (Fourel et al., 1992, Mol.
Cell. Biol. 12, 5336-5344). This cryptic promoter is responsible for
activation of N-myc2, a functional processed gene which arose from
retroposition of an N-myc transcript (Fourel et al., 1992, Mol. Cell. Biol.
12, 5336-5344). These types of regulatory sequences have not yet been
isolated from plants.
This patent application describes, as an example, one transgenic
plant, T218, generated by tagging with a promoterless GUS
(β-glucuronidase) T-DNA vector. This plant is of particular interest in that
GUS expression was spatially and developmentally regulated in seed coats
and a promoter specific to this tissue has not been previously isolated.
Cloning of the insertion site uncovered a cryptic promoter within a region
of the tobacco genome not conserved among related species. This seed
coat-specific promoter can be useful for directing the expression of
selected genes to a specific tissue and stage of development.

Summary of Invention 

The present invention is directed to a cryptic promoter identified 
from Nicotiana tabacum (tobacco). Specifically this invention relates to a 
seed coat-specific cryptic promoter isolated from tobacco.

The transgenic tobacco plant, T218, contained a 4.7 kb EcoRI
fragment containing the 2.2 kb promoterless GUS-nos gene and 2.5 kb of
5' flanking tobacco DNA. Deletion of the region approximately between
2.5 and 1.0 kb of the 5' flanking region did not alter GUS expression, as
compared to the entire 4.7 kb GUS fusion. A further deletion to 0.5 kb of
the 5' flanking site resulted in complete loss of GUS activity. Thus the
region between 1.0 and 0.5 kb of the 5' flanking region of the tobacco DNA
contains the elements essential to gene activation. This region is
contained within an XbaI - SnaBI restriction site fragment of the flanking
tobacco DNA.

Thus according to the present invention there is provided a seed 
coat-specific cryptic promoter in tobacco contained within a DNA 
sequence, or analogue thereof, as shown in Figure 6. 



Further according to the present invention, there is provided a 
DNA sequence, or analogue thereof, as shown in Figure 6. 

This invention also relates to a cloning vector containing a seed 
coat-specific cryptic promoter from tobacco, which is contained within a 
DNA sequence, or analogue thereof, as shown in Figure 6 and a gene 
encoding a protein. 

This invention also includes a plant cell which has been transformed 
with a cloning vector as described above.

This invention further relates to a transgenic plant containing a
seed coat-specific promoter, operatively linked to a gene encoding a
protein.

Brief Description of the Drawings 

Figure 1 depicts the fluorogenic analyses of GUS expression in the
plant T218. Each bar represents the average ± one standard deviation of
three samples. Nine different tissues were analysed: leaf (L), stem (S),
root (R), anther (A), petal (P), ovary (O), sepal (Se), seeds 10 days
post-anthesis (S1) and seeds 20 days post-anthesis (S2). For all measurements
of GUS activity, the fraction attributed to intrinsic fluorescence, as
determined by analysis of untransformed tissues, is shaded black on the
graph. Absence of a black area at the bottom of a histogram indicates
that the relative contribution of the background fluorescence is too small
to be apparent.

Figure 2 shows the cloning of the GUS fusion in plant T218
(pT218) and construction of transformation vectors. Plant DNA is
indicated by the solid line and the promoterless GUS-nos gene is indicated
by the open box. The transcriptional start site and presumptive TATA box
are located by the closed and open arrowheads, respectively. DNA probes
#1, 2, 3 and RNA probe #4 are shown. The EcoRI fragment in pT218
was subcloned in the pBIN19 polylinker to create pT218-1. Fragments
truncated at the XbaI, SnaBI and XbaI sites were also subcloned to create
pT218-2, pT218-3 and pT218-4. Abbreviations for the endonuclease
restriction sites are as follows: EcoRI (E), HindIII (H), XbaI (X), SnaBI
(N), SmaI (M), SstI (S).

Figure 3 shows the expression pattern of promoter fusions during
seed development. GUS activity in developing seeds (4-20 days
post-anthesis (dpa)) of (Fig. 3a) plant T218 (•-•) and (Fig. 3b) plants
transformed with vectors pT218-1 (○-○), pT218-2 (□-□), pT218-3 (▽-▽) and
pT218-4 (△-△) which are illustrated in Figure 2. The 2 day delay in the
peak of GUS activity during seed development, seen with the pT218-2
transformant, likely reflects variation in greenhouse conditions.

Figure 4 shows GUS activity in 12 dpa seeds of independent
transformants produced with vectors pT218-1 (○), pT218-2 (□), pT218-3
(▽) and pT218-4 (△). The solid markers indicate the plants shown in
Figure 3 (b) and the arrows indicate the average values for plants
transformed with pT218-1 or pT218-2.

Figure 5 shows the mapping of the T218 GUS fusion termini and
expression of the region surrounding the insertion site in untransformed
plants.

(Fig. 5a) Mapping of the GUS mRNA termini in plant T218.
The antisense RNA probe from subclone #4 (Figure
2) was used for hybridization with total RNA of
tissues from untransformed plants (10 µg) and from
plant T218 (30 µg). Arrowheads indicate the
anticipated position of protected fragments if
transcripts were initiated at the same sites as the T218
GUS fusion.

(Fig. 5b) RNase protection assay using the antisense (relative
to the orientation of the GUS coding region) RNA
probe from subclone e (Figure 7) against 30 µg total
RNA of tissues from untransformed plants.

P, untreated RNA probe; -, control assay using the probe and tRNA
only; L, leaves from untransformed plants; 8, 10, 12, seeds from
untransformed plants at 8, 10, and 12 dpa, respectively; T10, seeds of plant
T218 at 10 dpa; +, control hybridization against unlabeled in vitro-
synthesized sense RNA from subclone c (panel a) or subclone e (panel b).
The two hybridizing bands near the top of the gel are end-labeled DNA
fragments of 3313 and 1049 bp, included in all assays to monitor losses
during processing. Molecular weight markers are in number of bases.

Figure 6 provides the nucleotide sequence of pT218 (top line) and
pIS-1 (bottom line). Sequence identity is indicated by dashed lines. The
T-DNA insertion site is indicated by a vertical line after bp 993. This site
on pT218 is immediately followed by a 12 bp filler DNA, which is followed
by the T-DNA. The first nine amino acids of the GUS gene and the GUS
initiation codon (*) are shown. The major and minor transcriptional start
sites are indicated by a large and a small arrow, respectively. The presumptive
TATA box is identified and is in boldface. Additional putative TATA and
CAAT boxes are marked with boxes. The locations of direct (1-5) and
indirect (6-8) repeats are indicated by arrows.

Figure 7 shows the base composition of the region surrounding the
T218 insertion site cloned from untransformed plants. The site of T-DNA
insertion in plant T218 is indicated by the vertical arrow. The positions of
the 2 genomic clones pIS-1 and pIS-2, and of the various RNA probes
(a-e) used in RNase protection assays are indicated beneath the graph.



Figure 8 shows the Southern blot analyses of the insertion site in
Nicotiana species. DNA from N. tomentosiformis (N tom), N. sylvestris (N
syl), and N. tabacum (N tab) were digested with HindIII (H), XbaI (X) and
EcoRI (E) and hybridized using probe #2 (Figure 2). Lambda HindIII
markers (kb) are indicated.

Figure 9 shows the AT content of 5' non-coding regions of plant
genes. A program was written in PASCAL to scan GenBank release 75.0
and to calculate the AT contents of the 5' non-coding (solid bars) and the
coding regions (hatched bars) of all plant genes identified as
"Magnoliophyta" (flowering plants). The regions -200 to -1 and +1 to +200
were compared. Shorter sequences were also accepted if they were at
least 190 bp long. The horizontal axis shows the ratio of the AT content
(%). The vertical axis shows the number of the sequences having the
specified AT content ratios.
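
The calculation summarized for Figure 9 can be illustrated with a short
Python sketch. This is not the PASCAL program referred to above, and it does
not read GenBank records; the function names, the 0-based `start` index
(assumed here to mark position +1, the first base of the coding region) and
the default window sizes are assumptions made for illustration only.

```python
def at_content(seq):
    """Return the percentage of A and T bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "AT") / len(seq)


def flanking_at_contents(seq, start, window=200, min_len=190):
    """AT content of the regions -window..-1 and +1..+window around start.

    start is the 0-based index of position +1 (an assumed convention).
    Returns None when either region is shorter than min_len, mirroring
    the 190 bp cut-off mentioned in the figure description.
    """
    upstream = seq[max(0, start - window):start]
    downstream = seq[start:start + window]
    if len(upstream) < min_len or len(downstream) < min_len:
        return None
    return at_content(upstream), at_content(downstream)
```

Applied to each gene record in turn, the two percentages returned would feed
the two histograms (5' non-coding and coding) described for Figure 9.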



Detailed Description of the Preferred Embodiments 

T-DNA tagging with a promoterless β-glucuronidase (GUS) gene
generated a transgenic Nicotiana tabacum plant that expressed GUS
activity only in developing seed coats. Cloning and deletion analysis of the
GUS fusion revealed that the promoter responsible for seed coat
specificity was located in the plant DNA proximal to the GUS gene.
Deletion analyses localized the cryptic promoter to an approximately 0.5
kb region between an XbaI and a SnaBI restriction endonuclease site of the
5' flanking tobacco DNA. This region spans from nucleotide 1 to
nucleotide 467 as shown in Figure 6.

Thus, the present invention includes a DNA sequence comprising
the seed coat-specific cryptic promoter from tobacco and analogues
thereof. Analogues of the cryptic promoter include any substitution,
deletion, or addition within the region, provided that said analogues maintain
the seed coat-specific expression activity.

The term cryptic promoter means a promoter that is not associated
with a gene and thus does not control expression in its native location.
These inactive regulatory sequences are buried in the genome but are
capable of being functional when positioned adjacent to a gene.

The DNA sequence of the present invention thus includes the DNA
sequence as shown in Figure 6, the promoter region within the sequence
as shown in Figure 6 (for example from nucleotide 1 to 476), and
analogues thereof. Analogues include those DNA sequences which
hybridize under stringent hybridization conditions (see Maniatis et al., in
Molecular Cloning (A Laboratory Manual), Cold Spring Harbor
Laboratory, 1982, p. 387-389) to the DNA sequence as shown in Figure 6,
provided that said sequences maintain the seed coat-specific promoter
activity. An example of one such stringent hybridization condition may be
hybridization at 4XSSC at 65°C, followed by washing in 0.1XSSC at 65°C
for an hour. Alternatively, an exemplary stringent hybridization condition
could be in 50% formamide, 4XSSC at 42°C. Analogues also include
those DNA sequences which hybridize to the sequence as shown in Figure
6 under relaxed hybridization conditions, provided that said sequences
maintain the seed coat-specific promoter activity. Examples of such
non-stringent hybridization conditions include hybridization at 4XSSC at
50°C or with 30-40% formamide at 42°C.

There are several lines of evidence that suggest that the seed coat-
specific expression of GUS activity in the plant T218 is regulated by a
cryptic promoter. The region surrounding the promoter and
transcriptional start site for the GUS gene is not transcribed in
untransformed plants. Transcription was only observed in plant T218
when T-DNA was inserted in cis. DNA sequence analysis did not uncover
a long open reading frame within the 3.3 kb region cloned. Moreover, the
region is very AT rich and predicted to be noncoding (data not shown) by
the Fickett algorithm (Fickett, 1982, Nucleic Acids Res. 10, 5303-5318) as
implemented in DNASIS 7.0 (Hitachi). Southern blots revealed that the
insertion site is within the N. tomentosiformis genome and is not conserved
among related species as would be expected for a region with an important
gene.
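
The kind of open reading frame search mentioned above can be sketched in
Python as follows. This is a simple six-frame ATG-to-stop scan written for
illustration; it is not the Fickett algorithm and not the DNASIS
implementation, and the function names are assumptions. ORFs that run off
the end of the sequence without a stop codon are not counted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Length, in codons, of the longest ATG-to-stop ORF in six frames.

    The count includes the ATG and excludes the stop codon.
    """
    best = 0
    for strand in (seq.upper(), reverse_complement(seq)):
        for frame in range(3):
            run = 0  # codons in the ORF being read; 0 means no open ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if run == 0:
                    if codon == "ATG":
                        run = 1
                elif codon in STOP_CODONS:
                    best = max(best, run)
                    run = 0
                else:
                    run += 1
    return best
```

A region would be flagged as lacking a long ORF when the value returned is
below whatever codon threshold the analyst chooses; no such threshold is
stated in this description.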

As this is the first report of a cryptic promoter in plants, it is
impossible to estimate the degree to which cryptic promoters may
contribute to the high frequencies of promoterless marker gene activation
in plants. It is interesting to note that transcriptional GUS fusions in
Arabidopsis occur at much greater frequencies (54%) than translational
fusions (1.6%; Kertbundit et al., 1991, Proc. Natl. Acad. Sci. USA 88,
5212-5216). The possibility that cryptic promoters may account for some
fusions was recognized by Lindsey et al. (1993, Transgenic Res. 2, 33-47).

The results disclosed herewith confirm the findings of others (Gheysen et
al., 1987, Proc. Natl. Acad. Sci. USA 84, 6169-6173 and 1991, Genes Dev. 5,
287-297) that T-DNA may insert into A-T rich regions, as do plant
transposable elements (Capel et al., 1993, Nucleic Acids Res. 21,
2369-2373). We illustrate that promoters of plant genes are also A-T rich,
raising speculation that gene insertions into these regions could facilitate
the rapid acquisition of new regulatory elements during gene evolution.

The insertion of functional genes into the nuclear genome and
acquisition of new regulatory sequences has already played a major role in
the diversification of certain genes and the endosymbiosis of organelles. In
plants, most organellar proteins are nuclear encoded due to the ongoing
transfer of their genes into the nucleus (Palmer, 1991, In Bogorad L and
Vasil IK (eds) The Molecular Biology of Plastids, Academic Press, San
Diego, pp 5-53). Recently, it has been shown that the cox 2 genes of
cowpea (Nugent and Palmer, 1991, Cell 66, 473-481) and soybean (Covello
and Gray, 1992, EMBO J. 11, 3815-3820) were transferred from
mitochondria to nucleus without promoters by RNA intermediates. The
results disclosed herewith, with T-DNA-mediated gene fusions, reveal the
facility with which promoters can be acquired by incoming genes. The
presence of cryptic promoters and diverse regulatory elements in the
intergenic regions may ensure that genes rapidly achieve the features
needed to meet the demands of complex multicellular organisms.

The cryptic promoter of the present invention can also be used to
direct the expression of any given gene, spatially and developmentally,
to developing seed coats. Some examples of such uses, which are not to
be considered limiting, include:

1. Modification of storage reserves in seed coats, such as starch 
by the expression of yeast invertase to mobilize the starch or 
expression of the antisense transcript of ADP-glucose 
pyrophosphorylase to inhibit starch biosynthesis. 



2. Modification of seed color contributed by condensed tannins
in the seed coats by expression of antisense transcripts of the
phenylalanine ammonia lyase or chalcone synthase genes.

3. Modification of fibre content in seed-derived meal by
expression of antisense transcripts of the caffeic acid-O-
methyl transferase or cinnamoyl alcohol dehydrogenase
genes.

4. Inhibition of seed coat maturation by expression of 
ribonuclease genes to allow for increased seed size, and to 
reduce the relative biomass of seed coats, and to aid in 
dehulling of seeds. 

5. Expression of genes in seed coats coding for insecticidal
proteins such as α-amylase inhibitor or protease inhibitor.

6. Partitioning of seed metabolites such as glucosinolates into
seed coats for nematode resistance.

Thus this invention is directed to such promoter and gene
combinations. Further this invention is directed to such promoter and
gene combinations in a cloning vector, wherein the gene is under the
control of the promoter and is capable of being expressed in a plant cell
transformed with the vector. This invention further relates to transformed
plant cells and transgenic plants regenerated from such plant cells. The
promoter and promoter gene combination of the present invention can be
used to transform any plant cell for the production of any transgenic plant.
The present invention is not limited to any plant species.

While this invention is described in detail with particular reference 
to preferred embodiments thereof, said embodiments are offered to 
illustrate but not limit the invention. 






EXAMPLES 



Characterization of a Seed Coat-Specific GUS Fusion

Transfer of binary constructs to Agrobacterium and leaf disc
transformation of Nicotiana tabacum SR1 were performed as described by
Fobert et al. (1991, Plant Mol. Biol. 17, 837-851). Plant tissue was
maintained on 100 µg/ml kanamycin sulfate (Sigma) throughout in vitro
culture.



Nine hundred and forty transgenic plants were produced. Several
hundred independent transformants were screened for GUS activity in
developing seeds using the fluorogenic assay. One of these, T218, was
chosen for detailed study because of its unique pattern of GUS expression.

Fluorogenic and histological GUS assays were performed according
to Jefferson (Plant Mol. Biol. Rep., 1987, 5, 387-405), as modified by
Fobert et al. (Plant Mol. Biol., 1991, 17, 837-851). For initial screening,
leaves were harvested from in vitro grown plantlets. Later, flowers
corresponding to developmental stages 4 and 5 of Koltunow et al. (Plant
Cell, 1990, 2, 1201-1224) and beige seeds, approximately 12-16 dpa (Chen
et al., 1988, EMBO J. 7, 297-302), were collected from plants grown in the
greenhouse. For detailed, quantitative analysis of GUS activity, leaf, stem
and root tissues were collected from kanamycin resistant F1 progeny of the
different transgenic lines grown in vitro. Floral tissues were harvested at
developmental stages 8-10 (Koltunow et al., 1990, Plant Cell 2, 1201-1224)
from the original transgenic plants. Flowers of these plants were also
tagged and developing seeds were collected from capsules at 10 and 20
dpa. In all cases, tissue was weighed, immediately frozen in liquid
nitrogen, and stored at -80°C.

Tissues analyzed by histological assay were at the same
developmental stages as those listed above. Different hand-cut sections
were analyzed for each organ. For each plant, histological assays were
performed on at least two different occasions to ensure reproducibility.
Except for floral organs, all tissues were assayed in phosphate buffer
according to Jefferson (1987, Plant Mol. Biol. Rep. 5, 387-405), with 1 mM
X-Gluc (Sigma) as substrate. Flowers were assayed in the same buffer
containing 20% (v/v) methanol (Kosugi et al., 1990, Plant Sci. 70, 133-140).

Tissue-specific patterns of GUS expression were only found in
seeds. For instance, GUS activity in plant T218 (Figure 1) was localized in
seeds from 9 to 17 days post-anthesis (dpa). GUS activity was not detected
in seeds at other stages of development or in any other tissue analyzed
which included leaf, stem, root, anther, ovary, petal and sepal (Figure 1).
Histological staining with X-Gluc revealed that GUS expression in seeds at
14 dpa was localized in seed coats but was absent from the embryo,
endosperm, vegetative organs and floral organs (results not shown).

The seed coat-specificity of GUS expression was confirmed with the
more sensitive fluorogenic assay of seeds derived from reciprocal crosses
with untransformed plants. The seed coat differentiates from maternal
tissues called the integuments which do not participate in double
fertilization (Esau, 1977, Anatomy of Seed Plants. New York: John Wiley
and Sons). If GUS activity is strictly regulated, it must originate from GUS
fusions transmitted to seeds maternally and not by pollen. As shown in
Table 1, this is indeed the case. As a control, GUS fusions expressed in
embryo and endosperm, which are the products of double fertilization,
should be transmitted through both gametes. This is illustrated in Table 1
for GUS expression driven by the napin promoter (BngNAP1, Baszczynski
and Fallis, 1990, Plant Mol. Biol. 14, 633-635) which is active in both
embryo and endosperm (data not shown).

Table 1. GUS activity in seeds at 14 days post anthesis.

   Cross (female x male)          GUS Activity
                                  nmole MU/min/mg Protein

   T218        x  T218             1.09 ± 0.39
   T218        x  WT (a)           3.02 ± 0.19
   WT          x  T218             0.04 ± 0.005
   WT          x  WT               0.04 ± 0.005
   NAP-5 (b)   x  NAP-5           14.6  ± 7.9
   NAP-5       x  WT               3.42 ± 1.60
   WT          x  NAP-5            2.91 ± 1.97

(a) WT, untransformed plants.

(b) Transgenic tobacco plants with the GUS gene fused to the
    napin, BngNAP1, promoter (Baszczynski and Fallis, 1990, Plant
    Mol. Biol. 14, 633-635).
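
The reasoning applied to Table 1 can be made explicit with a small Python
sketch. The mean values are taken from Table 1; the assumption that the
first parent listed in each cross is the female, the ten-fold threshold over
the WT x WT background, and the function name are all illustrative choices
rather than part of the described method, and the standard deviations are
ignored.

```python
# (female parent, male parent) -> mean GUS activity from Table 1,
# in nmole MU/min/mg protein. Parent order is an assumption.
TABLE_1 = {
    ("T218", "T218"): 1.09,
    ("T218", "WT"): 3.02,
    ("WT", "T218"): 0.04,
    ("WT", "WT"): 0.04,
    ("NAP-5", "NAP-5"): 14.6,
    ("NAP-5", "WT"): 3.42,
    ("WT", "NAP-5"): 2.91,
}


def transmission(line, table, fold=10.0):
    """Classify how GUS activity of a line is transmitted to seeds.

    A cross counts as active when its mean exceeds fold times the
    WT x WT background (an arbitrary, illustrative cut-off).
    """
    cutoff = fold * table[("WT", "WT")]
    via_female = table[(line, "WT")] > cutoff
    via_male = table[("WT", line)] > cutoff
    if via_female and via_male:
        return "both gametes"
    if via_female:
        return "maternal only"
    if via_male:
        return "paternal only"
    return "not transmitted"
```

With these values, `transmission("T218", TABLE_1)` gives "maternal only" and
`transmission("NAP-5", TABLE_1)` gives "both gametes", matching the
interpretation given in the text.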

Cloning and Analysis of the Seed Coat-Specific GUS Fusion 

Genomic DNA was isolated from freeze-dried leaves using the
protocol of Sanders et al. (1987, Nucleic Acid Res. 15, 1543-1558). Ten
micrograms of T218 DNA was digested for several hours with EcoRI using
the appropriate manufacturer-supplied buffer supplemented with 25 mM
spermidine. After electrophoresis through a 0.8% TAE agarose gel, the
DNA size fraction around 4-6 kb was isolated, purified using the
GeneClean kit (BIO 101 Inc., LaJolla, CA), ligated to phosphatase-treated
EcoRI-digested Lambda GEM-2 arms (Promega) and packaged in vitro as
suggested by the supplier. Approximately 125,000 plaques were
transferred to nylon filters (Nytran, Schleicher and Schuell) and screened
by plaque hybridization (Rutledge et al., 1991, Mol. Gen. Genet. 229,
31-40), using the 3' end (termination signal) of the nos gene as probe (probe
#1, Figure 2). This sequence, contained in a 260 bp SstI/EcoRI restriction
fragment from pPRF-101 (Fobert et al., 1991, Plant Mol. Biol. 17, 837-851),
was labeled with [α-32P]-dCTP (NEN) using random priming (Stratagene).
After plaque purification, phage DNA was isolated (Sambrook et al., 1989,
Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor
Laboratory Press), mapped and subcloned into pGEM-4Z (Promega). The
EcoRI fragment and deletions shown in Figure 2 were inserted into pBIN19
(Bevan, 1984, Nucl. Acid Res. 12, 8711-8721). Restriction mapping was used to
determine the orientation of the fusion in pBIN19 and to confirm plasmid
integrity. Plants were transformed with a derivative which contained the 5'
end of the GUS gene distal to the left border repeat. This orientation is
the same as that of the GUS gene in the binary vector pBI101 (Jefferson,
1987, Plant Mol. Biol. Rep. 5, 387-405).

The GUS fusion in plant T218 was isolated as a 4.7 kb EcoRI
fragment containing the 2.2 kb promoterless GUS-nos gene at the T-DNA
border of pPRF120 and 2.5 kb of 5' flanking tobacco DNA (pT218, Figure
2), using the nos 3' fragment as probe (probe #1, Figure 2). To confirm
the ability of the flanking DNA to activate the GUS coding region, the
entire 4.7 kb fragment was inserted into the binary transformation vector
pBIN19 (Bevan, 1984, Nucl. Acid Res. 12, 8711-8721), as shown in Figure
2. Several transgenic plants were produced by Agrobacterium-mediated
transformation of leaf discs. Southern blots indicated that each plant
contained 1-4 T-DNA insertions at unique sites. The spatial patterns of
GUS activity were identical to that of plant T218. Histologically, GUS
staining was restricted to the seed coats of 14 dpa seeds and was absent in
embryos and 20 dpa seeds (results not shown). Fluorogenic assays of GUS
activity in developing seeds showed that expression was restricted to seeds
between 10 and 17 dpa, reaching a maximum at 12 dpa (Figure 3 (a) and 3
(b)). The 4.7 kb fragment therefore contained all of the elements required
for the tissue-specific and developmental regulation of GUS expression.

To locate regions within the flanking plant DNA responsible for
seed coat-specificity, truncated derivatives of the GUS fusion were
generated (Figure 2) and introduced into tobacco plants. Deletion of the
region approximately between 2.5 and 1.0 kb, 5' of the insertion site
(pT218-2, Figure 2) did not alter expression compared with the entire 4.7
kb GUS fusion (Figures 3b and 4). Further deletion of the DNA, to the
SnaBI restriction site approximately 0.5 kb, 5' of the insertion site
(pT218-3, Figure 2), resulted in the complete loss of GUS activity in
developing seeds (Figures 3b and 4). This suggests that the region approximately
between 1.0 and 0.5 kb, 5' of the insertion site contains elements essential
to gene activation. GUS activity in seeds remained absent with more
extensive deletion of plant DNA (pT218-4, Figures 2, 3b and 4) and was
not found in other organs including leaf, stem, root, anther, petal, ovary or
sepal from plants transformed with any of the vectors (data not shown).
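
The logic of the deletion series can be summarized in a brief Python sketch.
The flanking lengths are the approximate values given in the text; pT218-4
is left out because its end point is not stated numerically here. The data
structure and the function name are assumptions made for illustration.

```python
# (construct, approximate kb of 5' flanking tobacco DNA retained,
#  GUS activity detected in developing seeds)
CONSTRUCTS = [
    ("pT218-1", 2.5, True),
    ("pT218-2", 1.0, True),
    ("pT218-3", 0.5, False),
]


def essential_interval(constructs):
    """Return (near, far) in kb bounding the elements required for activity.

    near is the longest flank that is still inactive and far is the
    shortest flank that is still active; the essential elements must lie
    between them. Returns None if the series has no active or no
    inactive member.
    """
    active = [kb for _, kb, on in constructs if on]
    inactive = [kb for _, kb, on in constructs if not on]
    if not active or not inactive:
        return None
    far, near = min(active), max(inactive)
    if near >= far:
        raise ValueError("deletion series is not internally consistent")
    return near, far
```

For the constructs listed, `essential_interval(CONSTRUCTS)` returns
`(0.5, 1.0)`, the interval between 0.5 and 1.0 kb 5' of the insertion site
identified above.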



The transcriptional start site for the GUS gene in plant T218 was
determined by RNase protection assays with RNA probe #4 (Figure 2)
which spans the T-DNA/plant DNA junction. For RNase protection
assays, various restriction fragments from pIS-1, pIS-2 and pT218 were
subcloned into the transcription vector pGEM-4Z as shown in Figures 7
and 2, respectively. A 440 bp HindIII fragment of the tobacco
acetohydroxyacid synthase SURA gene was used to detect SURA and
SURB mRNA. DNA templates were linearized and transcribed in vitro
with either T7 or SP6 polymerases to generate strand-specific RNA probes
using the Promega transcription kit and [α-32P]CTP as labeled nucleotide.
RNA probes were further processed as described in Ouellet et al. (1992,
Plant J. 2, 321-330). RNase protection assays were performed as described
in Ouellet et al. (1992, Plant J. 2, 321-330), using 10-30 µg of total RNA
per assay. Probe digestion was done at 30°C for 15 min using 30 µg ml-1
RNase A (Boehringer Mannheim) and 100 units ml-1 RNase T1
(Boehringer Mannheim). Figure 5 shows that two termini were mapped in
the plant DNA. The major 5' terminus is situated at an adenine residue,
122 bp upstream of the T-DNA insertion site (Figure 6). The sequence at
this transcriptional start site is similar to the consensus sequence for plant
genes (C/TTC a ATCA; Joshi, 1987, Nucleic Acids Res. 15, 6643-6653). A
TATA box consensus sequence is present 37 bp upstream of this start site
(Figure 6). The second, minor terminus mapped 254 bp from the insertion
site in an area where no obvious consensus motifs could be identified
(Figure 6).

The tobacco DNA upstream of the insertion site is very AT-rich
(>75%, see Figure 7). A search for promoter-like motifs and scaffold
attachment regions (SAR), which are often associated with promoters
(Breyne et al., 1992, Plant Cell 4, 463-471; Gasser and Laemmli, 1986, Cell
46, 521-530), identified several putative regulatory elements in the first 1.0
kb of tobacco DNA flanking the promoterless GUS gene (data not shown).
However, the functional significance of these sequences remains to be
determined.



Cloning and Analysis of the Insertion Site from Untransformed Plants

A lambda DASH genomic library was prepared from DNA of
untransformed N. tabacum SR1 plants by Stratagene for cloning of the
insertion site corresponding to the gene fusion in plant T218. The
screening of 500,000 plaques with probe #2 (Figure 2) yielded a single
lambda clone. The EcoRI and XbaI fragments were subcloned in pGEM-
4Z to generate pIS-1 and pIS-2. Figure 7 shows these two overlapping
subclones, pIS-1 (3.0 kb) and pIS-2 (1.1 kb), which contain tobacco DNA
spanning the insertion site (marked with a vertical arrow). DNA sequence
analysis (using dideoxy nucleotides in both directions) revealed that the
clones, pT218 and pIS-1, were identical over a length of more than 2.5 kb,
from the insertion site to their 5' ends, except for a 12 bp filler DNA
insert of unknown origin at the T-DNA border (Figure 6 and data not
shown). The presence of filler DNA is a common feature of T-DNA/plant
DNA junctions (Gheysen et al., 1991, Gene 94, 155-163). Gross
rearrangements that sometimes accompany T-DNA insertions (Gheysen et
al., Gene 94, 155-163; and 1991, Genes Dev. 5, 287-297) were not
found (Figure 6) and therefore could not account for the promoter activity
associated with this region. The region of pIS-1 and pIS-2, 3' of the
insertion site is also very AT-rich (Figure 7).

To determine whether there was a gene associated with the pT218
promoter, more than 3.3 kb of sequence contained within pIS-1 and pIS-2
was analyzed for the presence of long open reading frames (ORFs).
However, none were detected in this region (data not shown). To
determine whether the region surrounding the insertion site was
transcribed in untransformed plants, Northern blots were performed with
RNA from leaf, stem, root, flower and seeds at 4, 8, 12, 14, 16, 20 and 24
dpa. Total RNA from leaves was isolated as described in Ouellet et al.
(1992, Plant J. 2, 321-330). To isolate total RNA from developing seeds,
0.5 g of frozen tissue was pulverized by grinding with dry ice using a
mortar and pestle. The powder was homogenized in a 50 ml conical tube
containing 5 ml of buffer (1 M Tris-HCl, pH 9.0, 1% SDS) using a
Polytron homogenizer. After two extractions with equal volumes of
phenol:chloroform:isoamyl alcohol (25:24:1), nucleic acids were collected
by ethanol precipitation and resuspended in water. The RNA was
precipitated overnight in 2 M LiCl at 0°C, collected by centrifugation,
washed in 10% ethanol and resuspended in water. Northern blot
hybridization was performed as described in Gottlob-McHugh et al. (1992,
Plant Physiol. 100, 820-825). Probe #3 (Figure 2), which spans the entire
region of pT218 5' of the insertion, did not detect hybridizing RNA bands
(data not shown). To extend the sensitivity of RNA detection and to
include the region 3' of the insertion site within the analysis, RNase
protection assays were performed with 10 different RNA probes that
spanned both strands of pIS-1 and pIS-2 (Figure 7). Even after lengthy
exposures, protected fragments could not be detected with RNA from 8,
10, 12 dpa seeds or leaves of untransformed plants (see Figure 5 for
examples with two of the probes tested). The specific conditions used
allowed the resolution of protected RNA fragments as small as 10 bases
(data not shown). Failure to detect protected fragments was not due to
problems of RNA quality, as control experiments using the same samples
detected acetohydroxyacid synthase (AHAS) SURA and SURB mRNAs,
which are expressed at relatively low abundance (data not shown).
Conditions used in the present work were estimated to be sensitive enough
to detect low-abundance messages representing 0.001-0.01% of total
mRNA levels (Ouellet et al., 1992, Plant J. 2, 321-330). Therefore, the
region flanking the site of T-DNA insertion does not appear to be
transcribed in untransformed plants.
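
The search for long open reading frames described above can be illustrated with a six-frame scan. The Python sketch below is a minimal example under stated assumptions and is not the analysis performed in this work: the function names are hypothetical, an ORF is taken to run from the first ATG in a frame to the next in-frame stop codon, and the minimum length of 100 codons is an arbitrary choice, since the text does not state what length was considered "long".

```python
# Illustrative sketch only; the ORF definition and threshold are assumptions.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100):
    """Scan all six reading frames for ATG-initiated ORFs.

    Returns a list of (strand, start, end, n_codons) tuples. Coordinates
    are 0-based positions on the scanned strand, end is exclusive and
    includes the stop codon, and n_codons excludes the stop codon.
    """
    results = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        results.append((strand, start, i + 3, n_codons))
                    start = None
    return results


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 5 + "TAA"  # toy sequence, not from Figure 6
    print(find_orfs(toy, min_codons=3))
```

An empty result for a given threshold corresponds to the observation that no long ORFs were detected. In an AT-rich region, stop codons (TAA, TAG, TGA) are expected to occur frequently, which by itself tends to keep ORFs short.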



Genomic Origins of the Insertion Site 

Southern blots were performed to determine if the insertion site is
conserved among Nicotiana species. Genomic DNA (5 µg) was isolated,
digested and separated by agarose gel electrophoresis as described above.
After capillary transfer on to nylon filters, DNA was hybridized, and
probes were labeled, essentially as described in Rutledge et al. (1991, Mol.
Gen. Genet. 229, 31-40). High-stringency washes were in 0.2 x SSC at 65°C
while low-stringency washes were in 2 x SSC at room temperature. In
Figure 8, DNA of the allotetraploid species N. tabacum and the
presumptive progenitor diploid species N. tomentosiformis and N. sylvestris
(Okamuro and Goldberg, 1983, Mol. Gen. Genet. 198, 290-298) were
hybridized with probe #2 (Figure 2). Single hybridizing fragments of
identical size were detected in N. tabacum and N. tomentosiformis DNA
digested with HindIII, XbaI and EcoRI, but not in N. sylvestris.
Hybridizations with pIS-2 (Figure 8), which spans the same region but
includes DNA 3' of the insertion site, yielded the same results. They did
not reveal hybridizing bands, even under conditions of reduced stringency,
in additional Nicotiana species including N. rustica, N. glutinosa, N.
megalosiphon and N. debneyi (data not shown). Probe #3 (Figure 2)
revealed the presence of moderately repetitive DNA specific to the N.
tomentosiformis genome (data not shown). These results suggest that the
region flanking the insertion site is unique to the N. tomentosiformis
genome and is not conserved among related species as might be expected
for regions that encode essential genes.

All scientific publications and patent documents are incorporated 
herein by reference. 

The present invention has been described with regard to preferred 
embodiments. However, it will be obvious to persons skilled in the art 
that a number of variations and modifications can be made without 
departing from the scope of the invention as described in the following
claims. 
THE EMBODIMENTS OF THE INVENTION IN WHICH AN
EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED 
AS FOLLOWS: 



1. A seed coat-specific cryptic promoter from tobacco. 

2. The promoter of claim 1, contained within a DNA sequence, or 
analogue thereof, as shown in Figure 6. 

3. The promoter of claim 2, contained within a DNA sequence, or 
analogue thereof, from nucleotide 1 to nucleotide 467 as shown in Figure 
6. 



4. A DNA sequence, or analogue thereof, as shown in Figure 6, 
wherein said DNA sequence, or analogue thereof, codes for a seed coat-
specific promoter. 

5. The sequence of claim 4, or analogue thereof, from nucleotide 1 to 
nucleotide 467 as shown in Figure 6. 

6. A cloning vector which comprises a gene encoding a protein and a 
seed coat-specific cryptic promoter from tobacco, wherein the gene is 
under the control of the promoter and is capable of being expressed in a 
plant cell transformed with the vector. 



7. The vector of claim 6, wherein the seed coat-specific promoter is 
contained within a DNA sequence, or analogue thereof, as shown in 
Figure 6. 

8. The vector of claim 7, wherein the seed coat-specific promoter is 
contained within a DNA sequence, or analogue thereof, from nucleotide 1 
to nucleotide 467 as shown in Figure 6. 

9. A plant cell which has been transformed with a vector as claimed in
claim 6. 

10. A plant cell which has been transformed with a vector as claimed in 
claim 7. 

11. A plant cell which has been transformed with a vector as claimed in 
claim 8. 

12. A transgenic plant containing a promoter as claimed in claim 1, 
operatively linked to a gene encoding a protein. 

13. A transgenic plant containing a promoter as claimed in claim 2, 
operatively linked to a gene encoding a protein. 

14. A transgenic plant containing a promoter as claimed in claim 3, 
operatively linked to a gene encoding a protein. 
XbaI

1 TCTAGACTTGTCTTTTCTTTACATAATCCTCTTCTTCTTTTTTTTGTTAGTTTCTTCTGT 



61 TTTATCCAAAAAACGAATTATTGATTAAGAAATACACCAGACAAGTTTTTTACTTCTT1T 

121 TCTTTTTTTTTTTGTGGTAAAAAATTACACCTGGACAAGTTTATCACGAAAAT 

P 

181 GCTATTTAAGGGATGTAGTTCCGGACTATTTGGAAGATAAGTGTTAACAAAATAAATAAA 



rrTrTATAACAGTCATCCTTATTTATAACAATACTTT 



241 TAAAAAGTTTATACAGTTAGATCTCTCTATAACAGTCATCCTTATTTATAACAA 

301 ACTATAACCGTCAAATTTATTTTGAAACAAAATTTTCATG[TATGTTACTATAACAGTAT 

351 TTTATTATAGCAACC^AAAAATATCGAAACAGATACGATTGTTATAGAGCGAT 

SnaBI

*2i TATCATTATCCACATATTTTCGTAAGCCCAATTACTCCTCCTACGTACGA^ 

481 CCA^TTTAAAGTTGCAAAAATCCAATAGATTTCAATACTTCTTCAACTGGC 

541 GGTAATGACTCCTTTTTAACTTTTCATCTTTAA^TTG^ 

^-Xbal ► H>al ► 

601 TTTCTAGAAGAG/AGTGTTTTAACAC7TCTAGCTCTACTATTATCTGTGTTTCTAGAAGA 

"' " " 

661 ^AAATAGAAAATGTGTCCACCTCAAAAACAACTAAAGGTGGKAA&SJfiCACCTATTTA 



721 TTTTATTTTGGATTAATTAAGATATAGTAAAGATCA(^ 

781 TAf » ffGAATTTTAAGi ATGTGTACCGATTTAACTTTATTTACATTTATGTTTCGCACATA 

TA TA I 
841 TAAGAAGTCCGATTTGGAAATACTAGATpTGTCAATCAGGCAATTCATGTGGT 

901 ATTTAAGTTATATACAATGATGATATAAAGAATTTTTATACTATTAGTGCAAAT 

957 ATTACTAAAAATTATTATTCTATTAATTTATGCTATC I X^^T EIScXT Y^! 

1005 GCGGTACCCGGTGGTCAGTCCCTT ATG T£A CjjT CjJT GTA GAA ACC CjjA ACC 